
    From virtual demonstration to real-world manipulation using LSTM and MDN

    Robots assisting the disabled or elderly must perform complex manipulation tasks and must adapt to the home environment and preferences of their user. Learning from demonstration is a promising choice that would allow a non-technical user to teach the robot different tasks. However, collecting demonstrations in the home environment of a disabled user is time consuming, disruptive to the comfort of the user, and presents safety challenges. It would be desirable to perform the demonstrations in a virtual environment. In this paper we describe a solution to the challenging problem of behavior transfer from virtual demonstration to a physical robot. The virtual demonstrations are used to train a deep neural network based controller, which uses a Long Short-Term Memory (LSTM) recurrent neural network to generate trajectories. The training process uses a Mixture Density Network (MDN) to calculate an error signal suited to the multimodal nature of demonstrations. The controller learned in the virtual environment is transferred to a physical robot (a Rethink Robotics Baxter). An off-the-shelf vision component substitutes for the geometric knowledge available in the simulation, and an inverse kinematics module allows the Baxter to enact the trajectory. Our experimental studies validate the three contributions of the paper: (1) the controller learned from virtual demonstrations can successfully perform the manipulation tasks on a physical robot, (2) the LSTM+MDN architectural choice outperforms alternatives such as feedforward networks and mean-squared-error-based training signals, and (3) allowing imperfect demonstrations in the training set also allows the controller to learn how to correct its manipulation mistakes.
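
    The abstract does not include implementation details, so the following is only a minimal PyTorch sketch of the LSTM+MDN idea it describes: an LSTM consumes the observation sequence and its output parameterizes a Gaussian mixture over the next end-effector pose, trained with the mixture negative log-likelihood instead of mean-squared error so that multimodal demonstrations are not averaged away. All layer sizes, dimensions, and names are assumptions, not the authors' code.

        import torch
        import torch.nn as nn

        class LSTMMDNController(nn.Module):
            def __init__(self, obs_dim=12, pose_dim=7, hidden=128, n_mix=5):
                super().__init__()
                self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
                self.pose_dim, self.n_mix = pose_dim, n_mix
                # Each mixture component needs a weight logit plus a mean and log-std per pose dimension.
                self.head = nn.Linear(hidden, n_mix * (1 + 2 * pose_dim))

            def forward(self, obs_seq):                          # obs_seq: (B, T, obs_dim)
                h, _ = self.lstm(obs_seq)                        # (B, T, hidden)
                p = self.head(h)
                logits = p[..., :self.n_mix]                     # mixture weights (logits)
                mu, log_sigma = p[..., self.n_mix:].chunk(2, dim=-1)
                mu = mu.reshape(*mu.shape[:-1], self.n_mix, self.pose_dim)
                log_sigma = log_sigma.reshape(*log_sigma.shape[:-1], self.n_mix, self.pose_dim)
                return logits, mu, log_sigma

        def mdn_nll(logits, mu, log_sigma, target):
            # target: (B, T, pose_dim), the next demonstrated waypoint.
            t = target.unsqueeze(-2)                             # broadcast over mixture components
            log_two_pi = torch.log(torch.tensor(2.0 * torch.pi))
            comp = (-0.5 * ((t - mu) / log_sigma.exp()) ** 2 - log_sigma - 0.5 * log_two_pi).sum(-1)
            return -torch.logsumexp(torch.log_softmax(logits, -1) + comp, dim=-1).mean()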

    Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End Learning from Demonstration

    We propose a technique for multi-task learning from demonstration that trains the controller of a low-cost robotic arm to accomplish several complex picking and placing tasks, as well as non-prehensile manipulation. The controller is a recurrent neural network using raw images as input and generating robot arm trajectories, with the parameters shared across the tasks. The controller also combines VAE-GAN-based reconstruction with autoregressive multimodal action prediction. Our results demonstrate that it is possible to learn complex manipulation tasks, such as picking up a towel, wiping an object, and depositing the towel back in its previous position, entirely from raw images with direct behavior cloning. We show that weight sharing and reconstruction-based regularization substantially improve generalization and robustness, and that training on multiple tasks simultaneously increases the success rate on all tasks.
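
    As a rough illustration of the shared multi-task controller described above, the sketch below maps raw image sequences to arm commands with a single set of weights and adds an image-reconstruction term as a regularizer. The paper's VAE-GAN reconstruction and autoregressive multimodal action prediction are simplified here to a plain decoder and a direct regression head, and every layer size and name is an assumption.

        import torch
        import torch.nn as nn

        class SharedMultiTaskController(nn.Module):
            def __init__(self, action_dim=7, latent=256):
                super().__init__()
                self.encoder = nn.Sequential(              # raw image -> latent features, shared across tasks
                    nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
                    nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(64, latent))
                self.lstm = nn.LSTM(latent, latent, batch_first=True)
                self.action_head = nn.Linear(latent, action_dim)
                self.decoder = nn.Sequential(              # reconstruction branch used only as a regularizer
                    nn.Linear(latent, 64 * 8 * 8), nn.ReLU(),
                    nn.Unflatten(1, (64, 8, 8)),
                    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1))

            def forward(self, images):                     # images: (B, T, 3, H, W)
                B, T = images.shape[:2]
                z = self.encoder(images.flatten(0, 1))     # (B*T, latent)
                recon = self.decoder(z)                    # 32x32 reconstruction; compare (e.g. MSE) to a
                                                           # downsampled copy of the input during training
                h, _ = self.lstm(z.view(B, T, -1))
                actions = self.action_head(h)              # behavior-cloning target: demonstrated commands
                return actions, recon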

    Task Focused Robotic Imitation Learning

    For many years, successful applications of robotics were the domain of controlled environments, such as industrial assembly lines. Such environments are custom designed for the convenience of the robot and separated from human operators. In recent years, advances in artificial intelligence, in particular deep learning and computer vision, have allowed researchers to successfully demonstrate robots that operate in unstructured environments and directly interact with humans. One of the major applications of such robots is in assistive robotics. For instance, a wheelchair-mounted robotic arm can help disabled users in the performance of activities of daily living (ADLs) such as feeding and personal grooming. Early systems relied entirely on the control of the human operator, something that is difficult to accomplish by a user with motor and/or cognitive disabilities. In this dissertation, we describe research results that advance the field of assistive robotics. The overall goal is to improve the ability of the wheelchair / robotic arm assembly to help the user with the performance of the ADLs by requiring only high-level commands from the user. Let us consider an ADL involving the manipulation of an object in the user's home. This task can be naturally decomposed into two components: the movement of the wheelchair in such a way that the manipulator can conveniently grasp the object, and the movement of the manipulator itself. This dissertation provides an approach for addressing the challenge of finding a position appropriate for the required manipulation. We introduce the ease-of-reach score (ERS), a metric that quantifies the preferences for the positioning of the base while taking into consideration the shape and position of obstacles and clutter in the environment. As the brute-force computation of the ERS is computationally expensive, we propose a machine learning approach to estimate the ERS based on features and characteristics of the obstacles. This dissertation addresses the second component as well: the ability of the robotic arm to manipulate objects. Recent work in end-to-end learning of robotic manipulation has demonstrated that a deep-learning-based controller of vision-enabled robotic arms can be taught to manipulate objects from a moderate number of demonstrations. However, current state-of-the-art systems are limited in their robustness to physical and visual disturbances and do not generalize well to new objects. We describe new techniques based on task-focused attention that show significant improvements in the robustness of manipulation and performance in clutter.
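
    For concreteness, the sketch below spells out the brute-force ease-of-reach score the dissertation starts from: counting, for each candidate base position, how many of the enumerated grasps a planner can actually reach. The function and parameter names are hypothetical, and the expensive reachability check is passed in as a callable because that check (inverse kinematics plus collision-aware planning) is exactly what makes the exact score impractical to compute online.

        from typing import Callable, Iterable, Tuple

        Pose = Tuple[float, float, float]          # simplified (x, y, yaw) base pose

        def ease_of_reach_score(base_pose: Pose,
                                grasp_candidates: Iterable[object],
                                is_reachable: Callable[[Pose, object], bool]) -> int:
            """Exact ERS: number of distinct grasps reachable from this base pose."""
            return sum(1 for g in grasp_candidates if is_reachable(base_pose, g))

        def best_base_pose(candidate_poses, grasp_candidates, is_reachable):
            """Pick the base position maximizing the ERS; this is the expensive step
            that the dissertation replaces with a learned estimator."""
            return max(candidate_poses,
                       key=lambda p: ease_of_reach_score(p, grasp_candidates, is_reachable))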

    A Real-Time Technique For Positioning A Wheelchair-Mounted Robotic Arm For Household Manipulation Tasks

    Wheelchair-mounted robotic arms can help people with disabilities perform their activities of daily living (ADLs). The autonomy of such a system can range from full manual control (both wheelchair and robotic arm controlled by the human) to fully autonomous (with both the wheelchair and the robotic arm under autonomous control). Many ADLs require the robot to pick up an object from a cluttered environment, such as a glass of water from a table where several other objects exist. In this paper, we concentrate on the task of finding the optimal position of the base of the robotic arm (which is normally a rigid point on the wheelchair) such that the end effector can easily reach the target (regardless of whether this is done through human or robot control). We introduce the ease-of-reach score (ERS), a metric quantifying the preferences for the positioning of the base. As the brute-force computation of the ERS is computationally expensive, we propose an approach that estimates the ERS through a mixture of Gaussians. The parameters of the component Gaussians are learned offline and depend on the nature of the environment, such as the properties of the obstacles. Simulation results show that the estimated ERS closely matches the actual value and that the estimation is fast enough for real-time operation.
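
    The abstract gives only the outline of the estimator, so the following is an illustrative sketch with placeholder numbers rather than learned parameters: the estimated ERS over candidate base positions is a weighted sum of 2-D Gaussian components whose weights, means, and covariances would, in the paper's approach, be predicted offline from properties of the obstacles and the target.

        import numpy as np

        def estimated_ers(xy, components):
            """xy: (N, 2) candidate base positions; components: list of (weight, mean, cov)."""
            score = np.zeros(len(xy))
            for w, mean, cov in components:
                diff = xy - mean                                  # (N, 2)
                inv = np.linalg.inv(cov)
                norm = 1.0 / (2 * np.pi * np.sqrt(np.linalg.det(cov)))
                score += w * norm * np.exp(-0.5 * np.einsum('ni,ij,nj->n', diff, inv, diff))
            return score

        # Hypothetical example: one attractive component near the target and one
        # suppressive component around an obstacle.
        components = [( 1.0, np.array([0.6, 0.0]), 0.05 * np.eye(2)),
                      (-0.4, np.array([0.3, 0.2]), 0.02 * np.eye(2))]
        grid = np.array([[x, y] for x in np.linspace(0, 1, 20) for y in np.linspace(-0.5, 0.5, 20)])
        best_position = grid[np.argmax(estimated_ers(grid, components))]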

    Real-Time Placement Of A Wheelchair-Mounted Robotic Arm

    Picking up an object with a wheelchair-mounted robotic arm can be decomposed into a wheelchair navigation task, designed to position the robotic arm such that the object is 'easy to reach', and the actual grasp performed by the robotic arm. A convenient definition of the notion of ease of reach can be given by creating a score (ERS) that relies on the number of distinct ways the object can be picked up from a given location. Unfortunately, the accurate calculation of the ERS must rely on repeating the path planning process for every candidate position and grasp type, in the presence of obstacles. In this paper, we use bootstrap aggregation over hand-crafted, domain-specific features to learn a model for the estimation of the ERS. In a simulation study, we show that the estimated ERS closely matches the actual value and that the estimation is fast enough for real-time operation, even in the presence of a large number of obstacles in the scene.
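
    The sketch below illustrates the bootstrap-aggregation idea using scikit-learn's BaggingRegressor over regression trees; the feature set, training data, and hyperparameters are placeholders, since the paper's hand-crafted, domain-specific features are not listed in the abstract. The exact, planner-based ERS would only be computed offline to label the training set.

        import numpy as np
        from sklearn.ensemble import BaggingRegressor
        from sklearn.tree import DecisionTreeRegressor

        rng = np.random.default_rng(0)
        # Placeholder features per candidate base position: e.g. distance to target,
        # clearance to the nearest obstacle, obstacle count in a window, etc.
        X_train = rng.random((5000, 6))
        y_train = rng.random(5000)          # stands in for exact ERS labels computed offline

        ers_model = BaggingRegressor(DecisionTreeRegressor(max_depth=8),
                                     n_estimators=50, random_state=0)
        ers_model.fit(X_train, y_train)

        # Online: score many candidate base positions in one cheap batched call.
        X_candidates = rng.random((200, 6))
        best_candidate = int(np.argmax(ers_model.predict(X_candidates)))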

    Trajectory Adaptation Of Robot Arms For Head-Pose Dependent Assistive Tasks

    Assistive robots promise to increase the autonomy of disabled or elderly people by facilitating the performance of Activities of Daily Living (ADLs). Learning from Demonstration (LfD) has emerged as one of the most promising approaches for teaching robots tasks that are difficult to formalize. LfD requires the operator to demonstrate the execution of the task on the given hardware one or several times. Unfortunately, many ADLs such as personal grooming, feeding, or reading depend on the head pose of the assisted human. Trajectories learned using LfD would become useless or dangerous if applied naively in a situation with a different head pose. In this paper, we propose and experimentally validate a method to adapt the trajectories learned using LfD to the current head pose (position and orientation) and movement of the head of the assisted user.
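
    The abstract does not state how the adaptation is performed, so the sketch below shows one simple scheme consistent with it: waypoints demonstrated relative to a reference head pose are re-expressed with respect to the currently tracked head pose through a rigid transform. All function names are hypothetical.

        import numpy as np

        def pose_to_matrix(position, rotation):
            """position: (3,), rotation: (3, 3) -> 4x4 homogeneous transform."""
            T = np.eye(4)
            T[:3, :3] = rotation
            T[:3, 3] = position
            return T

        def adapt_trajectory(waypoints, head_ref, head_now):
            """waypoints: (N, 3) end-effector points demonstrated with the head at
            head_ref (4x4); returns the same points expressed for the current head
            pose head_now (4x4), e.g. from a head tracker updated every frame."""
            delta = head_now @ np.linalg.inv(head_ref)       # transform: reference -> current head frame
            pts = np.hstack([waypoints, np.ones((len(waypoints), 1))])
            return (delta @ pts.T).T[:, :3]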